Pupularity Contest¶

'WeRateDogs' is a Twitter comedy account that rates people's dogs with funny nicknames and witty comments. Here is an example of a very cute pupper getting their first rating:

This is Willow. You asked for a high five. But she is very small. Hopes a medium to low five is okay too. 12/10 pic.twitter.com/s7yN30bc2b

— WeRateDogs (@dog_rates) May 1, 2023

image.png

Since being created in 2015 by Matt Nelson, the account has acquired over 8 million followers and has been featured in the Huffington Post, BuzzFeed, and NPR. Matt has successfully turned this internet fame into an internet career with merch, a calendar, a book, partnerships, and even a mobile game. Good job, Matt! But wait, how can I become an internet star, and what makes a good tweet anyway?

WeRateDogs is likely popular because of two things: it is funny, and people love a good dog picture. Comedy is one of the harder things to break down into its component parts: philosophers have a hard time explaining what makes things funny, and neural networks have a hard time generating good comedy. However, WeRateDogs' tweets have some recurring jokes that may explain some of the appeal: its ratings and its monikers. WeRateDogs uses a silly rating system in which the denominator is 10 but the numerator is often 10 to 14, as in the example above, where Willow receives a '12/10' rating. The account also uses affectionate and funny monikers like 'doggo', 'pupper', 'floofer', and 'puppo' to describe the dog in the picture, based vaguely on its age or apparent wisdom, according to the dog dictionary. So ratings and monikers will work as a weak proxy for comedic appeal. I will call the model built from these features the Comedic Impact Model.
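To make the two comedic features concrete, here is a minimal sketch of how a rating and moniker flags could be pulled out of raw tweet text. The text above does not say how these features were actually extracted, so the function name `extract_comedic_features`, the regular expression, and the output layout are all assumptions made for illustration.

```python
import re

# Monikers named in the text above; the tuple itself is an illustrative assumption.
MONIKERS = ("doggo", "pupper", "puppo", "floofer")

# Matches ratings such as "12/10". Other digit/digit patterns (e.g. "24/7")
# would also match, so a real pipeline would need extra checks.
RATING_PATTERN = re.compile(r"(\d+)/(\d+)")


def extract_comedic_features(tweet_text):
    """Return the first rating found and a 0/1 flag for each moniker."""
    features = {"numerator": None, "denominator": None}
    match = RATING_PATTERN.search(tweet_text)
    if match:
        features["numerator"] = int(match.group(1))
        features["denominator"] = int(match.group(2))

    lowered = tweet_text.lower()
    for moniker in MONIKERS:
        # Whole-word match, allowing a plural "s" (e.g. "puppers").
        found = re.search(rf"\b{moniker}s?\b", lowered)
        features[moniker] = 1 if found else 0
    return features


willow = (
    "This is Willow. You asked for a high five. But she is very small. "
    "Hopes a medium to low five is okay too. 12/10 pic.twitter.com/s7yN30bc2b"
)
willow_features = extract_comedic_features(willow)
print(willow_features)
```

For the Willow tweet this picks up the 12/10 rating and sets every moniker flag to 0, because the tweet text itself never uses one of the four monikers.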

Now how do we decide what is a 'good dog picture'? Well, some breeds may reasonably be considered cuter, sillier, or otherwise more appealing, which could make that type of dog more popular. Therefore, it would be interesting to track the relationship between the dog breed in the tweeted photo and popularity. Also, since the monikers 'doggo', 'puppo', 'pupper', and 'floofer' correlate with the age or fluffiness of the dog in the picture, we can use them to approximate what ages of dogs people like pictures of. I call this model the Aesthetic Impact Model.

Feature Engineering¶

It is also important to operationalize 'popularity' in this analysis. In researching this topic, I found that the Engagement Rating, defined on Twitter as (retweets + favorites + comments + quotes) / followers, is often considered the most important metric for judging the popularity of a tweet. However, I didn't have access to comment counts, quote counts, or follower counts, so I had to settle for summing retweets and favorites into what I call the 'Popularity Rating'. The main disadvantage of the Popularity Rating relative to the Engagement Rating is that it doesn't scale a tweet's popularity by the tweeter's follower count, which matters a great deal when comparing a tweet from an account with 10 followers to one from an account with 1 million followers. As a result, this metric will favor tweets from later in the account's history (when WeRateDogs had more followers) over earlier ones.
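The difference between the two metrics can be sketched in a few lines. The function names and the interaction counts below are made up for illustration; only the two formulas come from the text above.

```python
def popularity_rating(retweets, favorites):
    """Popularity Rating as described above: retweets plus favorites."""
    return retweets + favorites


def engagement_rating(retweets, favorites, comments, quotes, followers):
    """Engagement Rating as described above: interactions divided by followers."""
    if followers <= 0:
        raise ValueError("followers must be positive")
    return (retweets + favorites + comments + quotes) / followers


# Two hypothetical tweets with identical interaction counts (made-up numbers).
small_account = engagement_rating(100, 50, 0, 0, followers=10)
large_account = engagement_rating(100, 50, 0, 0, followers=1_000_000)
same_popularity = popularity_rating(100, 50)

print(same_popularity, small_account, large_account)
```

Both hypothetical tweets get the same Popularity Rating of 150, while the Engagement Rating separates them by five orders of magnitude. That gap is the information the Popularity Rating cannot see.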

To account for this time bias, I included the number of days passed (date_number) in all of my multiple linear regressions. In effect, this lets us assess the predictive value of each independent variable while holding time passed constant. Another way to think of it is that each post is given a 'baseline follower count' that steadily increases over time. Using elapsed time, though, only approximates a monotonically increasing trend in follower count, and it cannot account for quick surges or drops in followers that may follow a very popular or very unpopular post.
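As a rough illustration of the date_number feature, the sketch below counts calendar days since the earliest tweet in a list of timestamps. The helper name `to_date_numbers`, the timestamp format, and the sample timestamps are assumptions; the text only says that date_number is the number of days passed.

```python
from datetime import datetime


def to_date_numbers(timestamps):
    """Convert ISO-style timestamps to whole calendar days since the earliest one."""
    parsed = [datetime.fromisoformat(ts) for ts in timestamps]
    first_day = min(parsed).date()
    return [(ts.date() - first_day).days for ts in parsed]


# Hypothetical timestamps, not taken from the real tweet archive.
sample_timestamps = [
    "2015-11-15 22:32:08",
    "2015-11-16 00:04:52",
    "2016-11-15 12:00:00",
]
date_numbers = to_date_numbers(sample_timestamps)
print(date_numbers)
```

Counting calendar days rather than elapsed 24-hour periods is a design choice here: two tweets posted either side of midnight land on different day numbers.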

Methodology¶

The first part of this analysis will assess the impact of the comedic variables on popularity. To do this I will use a multiple linear regression model whose independent variables are ratings, monikers, and the 'probability that the picture is actually of a dog', and whose target variable is the popularity_rating of each tweet. The second part will use a similar multiple linear regression to assess the impact of picture quality, as measured by dog breed, number of pictures, and monikers, on popularity. We can then compare the two models to see which is the better predictor of popularity.
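The regression setup described above can be sketched without any third-party library. This is a dependency-free illustration, not the code behind the reported results: the text does not say which library was used, the helper names `fit_ols` and `r_squared` are hypothetical, and the six data rows are synthetic values generated from known coefficients so that the fit can be checked.

```python
def fit_ols(rows, targets):
    """Least-squares fit of y = b0 + b1*x1 + ... via the normal equations."""
    design = [[1.0] + list(row) for row in rows]  # prepend an intercept column
    k = len(design[0])
    # Augmented matrix [X'X | X'y].
    aug = []
    for i in range(k):
        line = [sum(r[i] * r[j] for r in design) for j in range(k)]
        line.append(sum(r[i] * y for r, y in zip(design, targets)))
        aug.append(line)
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("predictors are collinear; drop one column")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(k):
            if r != col:
                factor = aug[r][col] / aug[col][col]
                aug[r] = [a - factor * b for a, b in zip(aug[r], aug[col])]
    return [aug[i][k] / aug[i][i] for i in range(k)]


def r_squared(rows, targets, coefs):
    """Share of variance explained relative to always predicting the mean."""
    preds = [coefs[0] + sum(c * x for c, x in zip(coefs[1:], row)) for row in rows]
    mean_y = sum(targets) / len(targets)
    ss_res = sum((y - p) ** 2 for y, p in zip(targets, preds))
    ss_tot = sum((y - mean_y) ** 2 for y in targets)
    return 1 - ss_res / ss_tot


# Synthetic rows: (date_number, rating_numerator, is_pupper).
rows = [(0, 10, 0), (1, 12, 1), (2, 11, 0), (3, 13, 1), (4, 10, 0), (5, 14, 1)]
# Targets generated as 100 + 10*date_number + 50*rating - 20*is_pupper.
popularity = [600, 690, 670, 760, 640, 830]

coefs = fit_ols(rows, popularity)
full_r2 = r_squared(rows, popularity, coefs)

# Time-only baseline, to see what the extra predictors add beyond date_number.
time_rows = [(row[0],) for row in rows]
time_only_r2 = r_squared(time_rows, popularity, fit_ols(time_rows, popularity))
print(coefs, full_r2, time_only_r2)
```

Keeping date_number as a column in every fit is what lets each remaining coefficient be read as an effect at a fixed point in time. Comparing the R-squared values of two fits on the same target is the model comparison described above.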

The final portion will bring all this together to try to predict the popularity of a tweet based on all the variables available. In this section, it will become evident that .... At the bottom, there are also a couple of other interesting insights into when and how the account has tended to post tweets. Cool! Let's get started.

Comedic Impact Model¶

Silly as it may seem, the comedic characteristics did have some ability to predict the popularity rating of a post. The multiple linear regression model had an R-squared of 0.371, meaning that the comedic variables (together with the number of days since account initiation) explained 37.1% of the variation in the popularity rating, relative to a baseline that just predicts the mean popularity. The model's F-statistic had a p-value of 9.31e-184, meaning that the model as a whole was statistically significant. Some independent variables, however, had a pretty high p-value, such as: probability of being a dog, probability of the picture being unambiguous, 'pupper', 'floofer', '10/10', '11/10', 'high-dogs'. The model's limited fit is evident in the graph below, where the predicted values are not very close to the actual values. I did look at a model without any of these statistically insignificant variables, and besides modest improvements in the model evaluation numbers related to degrees of freedom and overfitting, there were no meaningful predictive improvements. That more efficient model didn't capture what I wanted to investigate about the data, though, so I kept the original model with all the comedic variables.
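For reference, the R-squared quoted above compares the model's squared errors with those of a baseline that always predicts the mean. The improvements 'related to degrees of freedom' in the reduced model are typical of adjusted R-squared, although the text does not name the exact metrics, so treat the second formula as an assumption. Here $y_i$ is the actual popularity rating of tweet $i$, $\hat{y}_i$ the predicted one, $\bar{y}$ the mean rating, $n$ the number of tweets, and $p$ the number of predictors:

$$
R^2 = 1 - \frac{\sum_{i=1}^{n} (y_i - \hat{y}_i)^2}{\sum_{i=1}^{n} (y_i - \bar{y})^2},
\qquad
R^2_{\text{adj}} = 1 - (1 - R^2)\,\frac{n - 1}{n - p - 1}
$$

With $R^2 = 0.371$, the residual sum of squares is 62.9% of the total sum of squares. Dropping uninformative predictors lowers $p$, which can raise $R^2_{\text{adj}}$ even when $R^2$ barely moves.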

image-2.png

Aesthetic Impact Model¶

The multiple linear regression model that took in only aesthetic characteristics of the dogs, namely dog breed and dog moniker, did not perform as well as the Comedic Impact Model. It had an R-squared of 0.339 and was also statistically significant. Some independent variables, however, had a pretty high p-value, such as: probability of being a dog, probability of the picture being unambiguous, 'pupper', 'floofer', '10/10', '11/10', 'high-dogs'. The model's limited fit is evident in the graph below, where the predicted values are not very close to the actual values. I did look at a model without any of these statistically insignificant variables, and besides modest improvements in the model evaluation numbers related to degrees of freedom and overfitting, there were no meaningful predictive improvements. That more efficient model didn't capture what I wanted to investigate about the data, though, so I kept the original model with all the aesthetic variables.

image.png

image-2.png

What are the Significant Features?¶

image.png

Other Considerations: As Time Goes On...¶

One insight that is hard to see in the graphs, though, is that date_number is actually a very strong predictor of popularity. A model that used only date_number as a predictor had an R-squared of 0.316 and was statistically significant. That is only slightly lower than the model with all the comedic variables (R-squared = 0.371) and almost equal to the model with the aesthetic variables (R-squared = 0.339). The scatter plot with a regression line below illustrates this relationship.
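A time-only model like the one described here is a simple linear regression, and its R-squared equals the squared correlation between date_number and popularity. The sketch below shows the calculation on made-up numbers. The helper name `simple_regression` and the data are assumptions, and the values it prints are unrelated to the 0.316 reported above.

```python
def simple_regression(x, y):
    """Return slope, intercept, and R-squared for a one-predictor least-squares fit."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # For a single predictor, R-squared is the squared Pearson correlation.
    r2 = sxy ** 2 / (sxx * syy)
    return slope, intercept, r2


# Made-up values for illustration only.
date_number = [0, 100, 200, 300, 400, 500]
popularity_rating = [900, 2100, 2600, 4800, 5200, 7400]

slope, intercept, r2_time_only = simple_regression(date_number, popularity_rating)
print(slope, intercept, r2_time_only)
```

The slope can be read as the average gain in Popularity Rating per day since the account started, which is the 'baseline' growth the other models control for.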

This is likely because date_number is the best proxy in the model for follower count, which I was unable to retrieve from the tweets. We should therefore strongly consider that there may be a popularity feedback loop (those who are popular get more popular), or that the author is learning how to be more popular over time in some other way.

On the topic of time, it appears that most of WeRateDogs' tweets are posted between 3pm and 6pm and between 11pm and 2am. Wow! That's a lot of tweets after midnight! I wonder if the author is a night owl?
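An hour-of-day breakdown like this can be produced by counting tweets per hour. The following is a minimal sketch: the helper name `tweets_per_hour` and the timestamps are hypothetical. It also makes no time-zone adjustment, so if the archive stores timestamps in UTC (an assumption; the text does not say), the hours would need shifting before drawing conclusions about the author's sleep schedule.

```python
from collections import Counter
from datetime import datetime


def tweets_per_hour(timestamps):
    """Return a 24-element list with the number of tweets posted in each hour."""
    hours = Counter(datetime.fromisoformat(ts).hour for ts in timestamps)
    return [hours.get(hour, 0) for hour in range(24)]


# Hypothetical timestamps, not taken from the real tweet archive.
sample = [
    "2017-01-01 23:15:00",
    "2017-01-02 00:40:00",
    "2017-01-02 00:59:59",
    "2017-01-02 16:05:00",
]
hourly_counts = tweets_per_hour(sample)
print(hourly_counts)
```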

image.png

image.png